Auto-parallelisation of Sieve C++ Programs

Authors

  • Alastair F. Donaldson
  • Colin Riley
  • Anton Lokhmotov
  • Andrew Cook
Abstract

We describe an approach to automatic parallelisation of programs written in Sieve C++ (Codeplay’s C++ extension), using the Sieve compiler and runtime system. In Sieve C++, the programmer encloses a performance-critical region of code in a sieve block, thereby instructing the compiler to delay side-effects until the end of the block. The Sieve system partitions code inside a sieve block into independent fragments and speculatively distributes them among multiple cores. We present implementation details and experimental results for the Sieve system on the Cell BE processor.
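As an illustrative sketch (hypothetical function and data names; this requires Codeplay's Sieve compiler rather than a standard C++ compiler), a sieve block wrapping a simple loop might look as follows:

    // Sketch only: assumes the sieve-block construct described in the abstract.
    // Writes to 'data', which lives outside the block, are side-effects: they are
    // queued and applied in order on exit from the block, leaving the compiler
    // free to distribute the loop iterations across cores.
    void scale(float* data, int n, float factor) {
        sieve {
            for (int i = 0; i < n; ++i) {
                data[i] = data[i] * factor;
            }
        }
    }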

Related articles

Delayed Side-Effects Ease Multi-core Programming

Computer systems are increasingly parallel and heterogeneous, while programs are still largely written in sequential languages. The obvious suggestion that the compiler should automatically distribute a sequential program across the system usually fails in practice because of the complexity of dependence analysis in the presence of aliasing. We introduce the sieve language construct which facil...
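The aliasing problem can be seen in an ordinary C++ fragment such as the following sketch (hypothetical names): unless it can prove that dst and src never overlap, the compiler must assume a cross-iteration dependence and keep the loop sequential.

    // Plain C++ (not Sieve C++). If 'dst' and 'src' may point into the same
    // array, iteration i can read a value written by an earlier iteration,
    // so a conventional compiler cannot safely parallelise the loop.
    void shift_copy(int* dst, const int* src, int n) {
        for (int i = 0; i < n; ++i) {
            dst[i] = src[i] + 1;
        }
    }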

Strict and Relaxed Sieving for Multi-Core Programming

In Codeplay’s Sieve C++, the programmer can place code inside a “sieve block” thereby instructing the compiler to delay writes to global memory and apply them in order on exit from the block. The semantics of sieve blocks makes code more amenable to automatic parallelisation. However, strictly queueing writes until the end of a sieve block incurs overheads and is typically unnecessary. If the p...
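The strict queueing semantics can be mimicked in ordinary C++ roughly as in the sketch below (illustrative only; this is not the Sieve runtime's actual mechanism): writes to global memory are recorded while the "block" runs and replayed in order on exit.

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Sketch of strict sieve semantics in plain C++: delayed writes are queued
    // while the "block" runs and applied in program order when it exits.
    void scale_delayed(float* data, std::size_t n, float factor) {
        std::vector<std::pair<std::size_t, float>> pending;  // (index, value) write queue
        pending.reserve(n);

        // "Inside the block": reads see the original memory; writes are only queued.
        for (std::size_t i = 0; i < n; ++i) {
            pending.emplace_back(i, data[i] * factor);
        }

        // "On exit from the block": apply the queued writes in order.
        for (const auto& w : pending) {
            data[w.first] = w.second;
        }
    }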

Parallelisation of the Model-based Iterative Reconstruction Algorithm DIRA

New paradigms for parallel programming have been devised to simplify software development on multi-core processors and many-core graphics processing units (GPUs). Despite their obvious benefits, the parallelisation of existing computer programs is not an easy task. In this work, the use of the Open Multiprocessing (OpenMP) and Open Computing Language (OpenCL) frameworks is considered for the pa...
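For context, parallelising a loop with OpenMP typically amounts to annotating it with a pragma, as in this generic sketch (not taken from DIRA; function and data names are hypothetical):

    #include <cmath>
    #include <vector>

    // Generic OpenMP example: independent iterations are distributed across
    // the available cores by the 'parallel for' pragma (compile with -fopenmp).
    void apply_sqrt(std::vector<double>& image) {
        #pragma omp parallel for
        for (long i = 0; i < static_cast<long>(image.size()); ++i) {
            image[i] = std::sqrt(image[i]);
        }
    }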

Influence of the Sparse Matrix Structure on Automatic Parallelisation Efficiency

The simulated models and requirements of engineering programs like computational fluid dynamics and structural mechanics grow more rapidly than single-processor performance. Automatic parallelisation seems to be the obvious approach for huge and historic packages like PERMAS. In this paper we evaluate how preparatory steps on the big input matrices can improve the performance of the parallelisa...

Towards Automatic Parallelisation for Multi-Processor DSPs

This paper describes a preliminary compiler-based approach to achieving high-performance DSP applications by automatically mapping C programs to multi-processor DSP systems. DSP programs typically contain pointer-based memory accesses, making automatic parallelisation difficult. This paper presents a new method to convert a restricted class of pointer-based memory accesses into array accesses wi...
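The general idea of rewriting pointer traversals as array accesses can be illustrated by the following sketch (hypothetical example, not the paper's algorithm): the array-index form exposes an explicit access pattern that dependence analysis can reason about.

    // Before: pointer-walk form; the moving pointer obscures the access pattern.
    float sum_ptr(const float* p, int n) {
        float acc = 0.0f;
        for (const float* q = p; q != p + n; ++q) {
            acc += *q;
        }
        return acc;
    }

    // After: equivalent array-index form with an explicit affine access a[i],
    // which is much easier for dependence analysis and parallelisation.
    float sum_idx(const float a[], int n) {
        float acc = 0.0f;
        for (int i = 0; i < n; ++i) {
            acc += a[i];
        }
        return acc;
    }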

Publication year: 2007